-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: read team token from kafka headers in blobby #18337
Conversation
plugin-server/src/main/ingestion-queues/session-recording/session-recordings-consumer.ts
Outdated
Show resolved
Hide resolved
plugin-server/src/main/ingestion-queues/session-recording/session-recordings-consumer.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Was going to comment on how will we know if it works but i guess thats the distinction of the drop_cause
?
@@ -217,7 +217,7 @@ export class SessionManager { | |||
return | |||
} | |||
|
|||
const lagMultiplier = getLagMultipler(partitionLag) | |||
const lagMultiplier = getLagMultiplier(partitionLag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multipler is the formal version of the word... 🙈
yep, I made that more explicit so we'll see no more of the existing drop cause and one or other of the new ones 👍 |
|
||
// NOTE: This is simple validation - ideally we should do proper schema based validation | ||
if (event.event !== '$snapshot_items' || !$snapshot_items || !$session_id) { | ||
status.warn('🙈', 'Received non-snapshot message, ignoring') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's increment eventDroppedCounter
too here for visibility. Ideally, I'd love for all drop scenarios to increment this metric.
|
||
return statusWarn('header_token_team_missing_or_disabled', { | ||
token: token, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have an opportunity to drop here if !teamIdWithConfig.session_recording_opt_in
, don't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's implicit in this check. We only load teams that exist and have session_recording_opt_in
set to true.
I guess the assumption was it wasn't valuable enough to know the difference in the event dropped counter to stand the cost of loading more teams...
Recently we read tens of millions of messages knowing the only action was to drop them. For every message we had to parse the body to get the token so we could check if we wanted to drop it.
Let's put the team token into the kafka headers since if there's an incident we always care about those values.
And then we can read the team token from the header and drop the message without parsing it when the team does not have session recording enabled.
This still also reads the body to check for team token as a fallback. If nothing else this lets us deploy this without worrying about rolling out a switcheroo
So
token in headers, team not found - return early (new behaviour)
token in headers, team found - does parse the body, but doesn't check the team again
token not in headers, team not found - return early (as before)
token not in headers, team found - (as before)